Similar Structures inside RDF-Graphs
Abstract
RDF is the common data model for publishing structured data on the Web. RDF data sets are given as subject-predicate-object triples and are typically represented as directed edge-labeled graphs. To make the information represented by such graphs comprehensible, RDF Schema (RDFS) provides concepts to define a class structure as part of the given RDF graph and thus supports a more abstract view of the data set. In this paper we follow a different approach and propose to make an RDF graph more comprehensible by reducing its size: the graph is partitioned to discover subgraphs that are similar with respect to their structure. The methods applied to derive a partition are based on bisimulation and agglomerative clustering. We demonstrate the usefulness of the approach by applying it to several synthetic RDF data sets and one real-world RDF data set.
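To make the bisimulation idea concrete, the following is a minimal sketch, not the paper's algorithm: it groups the nodes of a small edge-labeled graph by naive signature refinement, so that two nodes share a block when their outgoing predicates lead to the same blocks. The function name `bisimulation_partition` and the sample triples are assumptions chosen for illustration, and the agglomerative clustering step mentioned in the abstract is not covered.

```python
from collections import defaultdict


def bisimulation_partition(triples):
    """Partition nodes into forward-bisimulation blocks (illustrative sketch)."""
    nodes = set()
    out_edges = defaultdict(list)
    for s, p, o in triples:
        nodes.update((s, o))
        out_edges[s].append((p, o))

    # Start with all nodes in a single block and refine until stable.
    block = {n: 0 for n in nodes}
    while True:
        # Signature: current block plus the set of (predicate, target block) pairs.
        signature = {
            n: (block[n], frozenset((p, block[o]) for p, o in out_edges[n]))
            for n in nodes
        }
        ids = {}
        new_block = {n: ids.setdefault(signature[n], len(ids)) for n in sorted(nodes)}
        if len(ids) == len(set(block.values())):
            break  # no block was split, so the partition is stable
        block = new_block

    groups = defaultdict(set)
    for n, b in block.items():
        groups[b].add(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical sample data: two "knows" edges between typed persons.
sample_triples = [
    ("alice", "knows", "bob"),
    ("carol", "knows", "dave"),
    ("alice", "type", "Person"),
    ("bob", "type", "Person"),
    ("carol", "type", "Person"),
    ("dave", "type", "Person"),
]
print(bisimulation_partition(sample_triples))
```

In this sample, `alice` and `carol` end up in one block and `bob` and `dave` in another, because each pair has the same outgoing predicates pointing into the same blocks. Collapsing each block into a single node would yield a smaller summary graph.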
Similar resources
Roman domination excellent graphs: trees
A Roman dominating function (RDF) on a graph $G = (V, E)$ is a labeling $f : V \rightarrow \{0, 1, 2\}$ such that every vertex with label $0$ has a neighbor with label $2$. The weight of $f$ is the value $f(V) = \sum_{v \in V} f(v)$. The Roman domination number, $\gamma_R(G)$, of $G$ is the minimum weight of an RDF on $G$. An RDF of minimum weight is called a $\gamma_R$-function. A graph $G$ is said to be $g...
GRAPHIUM: Visualizing Performance of Graph and RDF Engines on Linked Data
Graph size, density, and number of labels negatively affect the performance of all the engines. Graph summarization seems to be more affected by the graph density and the number of labels. Dense graphs are more influenced by graph size. RDF-3X outperforms the rest of the engines in pattern matching and graph creation. DEX seems to overcome the rest of the engines when the graphs ar...
HPRD: A High Performance RDF Database
In this paper, a high-performance storage system for RDF documents is introduced. The system employs optimized index structures for RDF data and efficient RDF query evaluation. The index scheme consists of three types of indices. The triple index manages basic RDF triples by dividing the original RDF graph into several subgraphs. The path index manages frequent RDF path patterns for long path query performance...
Defining and computing Least Common Subsumers in RDF
Several application scenarios in the Web of Data share the need to identify the commonalities between a pair of RDF resources. Motivated by such needs, we propose the definition and the computation of Least Common Subsumers (LCSs) in RDF. To this aim, we provide some original and fundamental reformulations, to deal with the peculiarities of RDF. First, we adapt a few definitions from Graph Theo...
A Tool for Efficiently Processing SPARQL Queries on RDF Quads
We present a tool called RIQ (RDF Indexing on Quads) for efficiently processing SPARQL queries on large RDF datasets containing quads. RIQ’s novel design includes: (a) a vector representation of RDF graphs for efficient indexing, (b) a filtering index for efficiently organizing similar RDF graphs, and (c) a decrease-and-conquer strategy for efficient query processing using the filtering index t...
Publication date: 2013